1.  INTRODUCTION


The result. Let $x_1,\dots,x_r$ be $r$ points in $R^d$, and $\mathcal{A}$ be a class of Borel sets in $R^d$. Denote by $\Delta^{\mathcal{A}}(x_1,\dots,x_r)$ the number of distinct sets in $\{\{x_1,\dots,x_r\}\cap A,\ A\in\mathcal{A}\}$. Define

$$m^{\mathcal{A}}(r)=\max_{x_1,\dots,x_r\in R^d}\Delta^{\mathcal{A}}(x_1,\dots,x_r).$$

Vapnik and Chervonenkis (1971) showed that either $m^{\mathcal{A}}(r)=2^r$ for any positive integer $r$, or $m^{\mathcal{A}}(r)\le r^s+1$, where $s$ is the smallest $k$ such that $m^{\mathcal{A}}(k)\ne 2^k$. A class of sets $\mathcal{A}$ for which the latter case holds will be called a V-C class with index $s$.
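To make the definitions of $m^{\mathcal{A}}(r)$ and the index $s$ concrete, the following is a minimal sketch for one simple class, the closed intervals on the real line. The class and the function names (`trace_count_intervals`, `shatter_coefficient_intervals`, `vc_index`) are illustrative assumptions and do not come from the paper.

```python
def trace_count_intervals(points):
    """Number of distinct subsets of `points` cut out by closed intervals [a, b]."""
    pts = sorted(set(points))
    # An interval that misses every point gives the empty trace.
    traces = {frozenset()}
    # Every nonempty trace of an interval is a run of consecutive sorted points.
    for i in range(len(pts)):
        for j in range(i, len(pts)):
            traces.add(frozenset(pts[i:j + 1]))
    return len(traces)


def shatter_coefficient_intervals(r):
    """m(r) for intervals: the count depends only on the number of distinct
    points, so any r distinct points attain the maximum."""
    return trace_count_intervals(range(r))


def vc_index(m, max_k=20):
    """Smallest k with m(k) != 2**k (the index s of the text), or None."""
    for k in range(1, max_k + 1):
        if m(k) != 2 ** k:
            return k
    return None


if __name__ == "__main__":
    for r in range(1, 6):
        print(r, shatter_coefficient_intervals(r), 2 ** r)
    print("index s =", vc_index(shatter_coefficient_intervals))
```

For intervals the count equals $r(r+1)/2+1$, so the index under this definition is $s=3$, and the polynomial bound $r^s+1$ is far from tight for this class.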
Suppose that $\mu$ is a probability measure on $R^d$. Let $X_1,X_2,\dots$ be a sequence of i.i.d. random vectors with common distribution $\mu$, and $\mu_n$ be the empirical distribution of $X_1,\dots,X_n$. Denote a "distance" between $\mu_n$ and $\mu$ by

$$D_n(\mathcal{A},\mu)=\sup_{A\in\mathcal{A}}|\mu_n(A)-\mu(A)|.$$

Throughout this paper we assume that $D_n(\mathcal{A},\mu)$, $\sup_{A\in\mathcal{A}}|\mu_n(A)-\mu_{2n}(A)|$ and $\sup_{A\in\mathcal{A}}\mu_n(A)$ are all random variables. We shall prove the following

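Theorem 1. Let $\mathcal{A}$ be a V-C class with index $s$ such that

$$\sup_{A\in\mathcal{A}}\mu(A)\le\delta\le 1/8. \tag{1}$$

Then for any $\varepsilon>0$ we have

$$P\{D_n(\mathcal{A},\mu)>\varepsilon\}\le 5(2n)^s\exp\big(-n\varepsilon^2/(91\delta+4\varepsilon)\big)+7(2n)^s\exp(-\delta n/68)+2^{2+s}n^{1+2s}\exp(-\delta n/8), \tag{2}$$

provided $n\ge\max\big(12\delta/\varepsilon^2,\ 68(1+s)(\log 2)/\delta\big)$.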

The  proof  of  (2)  is  based  on  an  important  inequality  proved  by  Devroye  and 
Wagner  (1980). 
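As a rough numerical illustration of Theorem 1, the sketch below evaluates the right-hand side of (2) together with its sample-size condition. The function name `theorem1_bound` and the convention of returning `None` when the condition fails are assumptions made for this example only.

```python
import math


def theorem1_bound(n, eps, delta, s):
    """Right-hand side of (2), or None when n is below the required sample size.

    delta is an upper bound on sup_A mu(A) (at most 1/8), s is the V-C index.
    """
    if not (0 < delta <= 1 / 8) or eps <= 0:
        raise ValueError("need 0 < delta <= 1/8 and eps > 0")
    n_min = max(12 * delta / eps ** 2, 68 * (1 + s) * math.log(2) / delta)
    if n < n_min:
        return None  # inequality (2) is not claimed for such n
    t1 = 5 * (2 * n) ** s * math.exp(-n * eps ** 2 / (91 * delta + 4 * eps))
    t2 = 7 * (2 * n) ** s * math.exp(-delta * n / 68)
    t3 = 2 ** (2 + s) * n ** (1 + 2 * s) * math.exp(-delta * n / 8)
    return t1 + t2 + t3


if __name__ == "__main__":
    print(theorem1_bound(100, 0.05, 0.1, 2))         # condition fails
    print(theorem1_bound(10 ** 6, 0.01, 0.01, 2))    # condition holds
```

The first exponent behaves like $n\varepsilon^2/\delta$ when $\varepsilon$ is of smaller order than $\delta$, which is where a small $\delta$ pays off.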


2.  HISTORICAL  NOTES 


A few remarks concerning this inequality are in order. In 1971, Vapnik and Chervonenkis proved that, for any $\varepsilon>0$,

$$P\{D_n(\mathcal{A},\mu)>\varepsilon\}\le 4\exp(-n\varepsilon^2/8)\,E\Delta^{\mathcal{A}}(X_1,\dots,X_{2n}). \tag{3}$$


This  inequality  is  quite  general  since  no  restrictions  such  as  (1)  are  imposed. 

In using this inequality, an estimate of $m^{\mathcal{A}}(n)$ must be given; see, for example, Gaenssler and Stute (1979), Wenocur and Dudley (1981).

The weakness of (3) lies in the fact that, in many applications, $\varepsilon=\varepsilon_n\to 0$ as $n\to\infty$. In this case $n\varepsilon_n^2$ may not tend to $\infty$, or may tend to $\infty$ very slowly. For this reason, the inequality proved by Devroye and Wagner (1980) is sometimes more useful. They proved that, if $\sup_A\mu(A)\le\delta\le 1/4$, then for any $\varepsilon>0$

$$P\{D_n(\mathcal{A},\mu)>\varepsilon\}\le 4m^{\mathcal{A}}(2n)\exp\big(-n\varepsilon^2/(64\delta+4\varepsilon)\big)+2P\{\sup_A\mu_{2n}(A)>2\delta\} \tag{4}$$

for $n\ge 8\delta/\varepsilon^2$.
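The difference between (3) and (4) can be seen by comparing the exponents alone. The sketch below does this for one illustrative choice, $\varepsilon_n=\delta_n=n^{-1/3}$; that choice, and the helper names, are assumptions for the example, and the second term of (4) is not evaluated.

```python
def vc_exponent(n, eps):
    """Exponent n*eps^2/8 appearing in (3)."""
    return n * eps ** 2 / 8


def dw_exponent(n, eps, delta):
    """Exponent n*eps^2/(64*delta + 4*eps) in the first term of (4)."""
    return n * eps ** 2 / (64 * delta + 4 * eps)


def dw_condition_holds(n, eps, delta):
    """Conditions under which (4) is stated: delta <= 1/4 and n >= 8*delta/eps^2."""
    return delta <= 0.25 and n >= 8 * delta / eps ** 2


if __name__ == "__main__":
    for n in (10 ** 4, 10 ** 6, 10 ** 8):
        eps = delta = n ** (-1 / 3)
        print(n, dw_condition_holds(n, eps, delta),
              vc_exponent(n, eps), dw_exponent(n, eps, delta))
```

With this choice the exponent in (3) grows like $n^{1/3}$ while the one in (4) grows like $n^{2/3}$, so the polynomial factor $m^{\mathcal{A}}(2n)$ is absorbed much sooner.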


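If we further have

$$\sup_{A\in\mathcal{A}}\ \sup_{x,y\in A}\|x-y\|\le\rho<\infty$$

and

$$\sup_{x\in R^d}\mu(S(x,\rho))\le\delta\le 1/4, \tag{5}$$

where $\|\cdot\|$ is the $L_2$ or $L_\infty$ norm in $R^d$, and $S(x,\rho)$ is the closed ball with radius $\rho$ centered at $x$, then

$$P\{D_n(\mathcal{A},\mu)>\varepsilon\}\le 4m^{\mathcal{A}}(2n)\exp\big(-n\varepsilon^2/(64\delta+4\varepsilon)\big)+4n\exp(-n\delta/10) \tag{6}$$

for $n\ge\max(1/\delta,\ 8\delta/\varepsilon^2)$.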

This inequality is most useful when $\mathcal{A}$ is the class of balls with the same diameter (norm $L_2$ or $L_\infty$). Otherwise $\delta$ may be much larger than $\sup_A\mu(A)$, and (6) gives no improvement over (3). Chen and Zhao (1984) made an essential improvement in the one-dimensional case:

Let $\mathcal{A}$ be a class of intervals in $R^1$, satisfying $\sup_{I\in\mathcal{A}}\mu(I)\le\delta\le 1$. Then there exist positive absolute constants $C_0,C_1,\dots,C_4$ such that for any $\varepsilon>0$

$$P\Big\{\sup_{I\in\mathcal{A}}|\mu_n(I)-\mu(I)|>\varepsilon\Big\}\le C_1\,[\ldots]\,\exp(-C_2n\varepsilon^2/\delta)+C_3\exp(-C_4n\varepsilon), \tag{7}$$

provided $n/\log n\ge C_0/\varepsilon$.

The proof of (7) relies on a result concerning the strong approximation to Brownian bridge of the empirical process on $R^1$. The argument fails in the general case $d>1$. The inequality (2), to be proved in the next section, gives a satisfactory generalization to the case $d\ge 1$.


3.  PROOF  OF  THEOREM  1 


Set

$$\delta_j=\cdots,\qquad j=1,2,\dots,r,$$

where $r$ will be chosen later. Then

$$\delta<\delta_1<\delta_2<\cdots<\delta_r<2\delta\le 1/4.$$

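When $n\ge 12\delta/\varepsilon^2$ we have $n\ge 8\delta_1/\varepsilon^2$. From (4), the definition of V-C class and the fact that

$$\sup_A\mu(A)\le\delta_1\le 1/4,$$

it follows that

$$\begin{aligned}P\{D_n(\mathcal{A},\mu)>\varepsilon\}&\le 4\{(2n)^s+1\}\exp\big(-n\varepsilon^2/(64\delta_1+4\varepsilon)\big)+2P\{\sup_A\mu_{2n}(A)>2\delta_1\}\\&\le 5(2n)^s\exp\big(-n\varepsilon^2/(64\delta_1+4\varepsilon)\big)+2P\{D_{2n}(\mathcal{A},\mu)>\delta_1\},\end{aligned}$$

provided $n\ge 12\delta/\varepsilon^2$.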

When $\delta n\ge 68(1+s)\log 2$, we have $2^{j-1}n\ge 8\delta_j/\delta_{j-1}^2$ for $j=2,3,\dots,r$. As before, from (4) and $\sup_A\mu(A)\le\delta_2\le 1/4$, it follows that

$$\begin{aligned}P\{D_n(\mathcal{A},\mu)>\varepsilon\}\le{}&5(2n)^s\exp\big(-n\varepsilon^2/(91\delta+4\varepsilon)\big)\\&+(2\cdot 5)(2\cdot 2n)^s\exp\big(-2n\delta_1^2/(64\delta_2+4\delta_1)\big)\\&+2^2P\{D_{2^2n}(\mathcal{A},\mu)>\delta_2\},\end{aligned}$$

provided $n\ge\max\big(68(1+s)\log 2/\delta,\ 12\delta/\varepsilon^2\big)$.

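Using (4) and $\sup_A\mu(A)\le\delta_j\le 1/4$ repeatedly, we obtain

$$\begin{aligned}P\{D_n(\mathcal{A},\mu)>\varepsilon\}\le{}&5(2n)^s\exp\big(-n\varepsilon^2/(91\delta+4\varepsilon)\big)\\&+\sum_{j=1}^{r-1}2^j\cdot 5(2^j\cdot 2n)^s\exp\big(-2^jn\delta_j^2/(68\delta_{j+1})\big)\\&+2^rP\{D_{2^rn}(\mathcal{A},\mu)>\delta_r\}\\={}&J_{1,n}+J_{2,n}+J_{3,n},\end{aligned} \tag{8}$$

provided $n\ge\max\big(68(1+s)\log 2/\delta,\ 12\delta/\varepsilon^2\big)$.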

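It is easy to see that

$$2^j\delta_j^2/\delta_{j+1}\ge 2j\delta,\qquad j=1,\dots,r-1. \tag{9}$$

Hence it follows from (8), (9) and $2^{1+s}\le e^{\delta n/68}$ that

$$\begin{aligned}J_{2,n}&=5(2n)^s\sum_{j=1}^{r-1}2^{(1+s)j}\exp\big(-2^jn\delta_j^2/(68\delta_{j+1})\big)\\&\le 5(2n)^s\sum_{j=1}^{r-1}(2^{1+s})^j\exp(-2j\delta n/68)\\&\le 5(2n)^s\sum_{j=1}^{\infty}\exp(-j\delta n/68)\\&=5(2n)^se^{-\delta n/68}(1-e^{-\delta n/68})^{-1}\\&\le 5(2n)^s(1-2^{-(1+s)})^{-1}e^{-\delta n/68}\\&\le 7(2n)^s\exp(-\delta n/68),\end{aligned} \tag{10}$$

where $s\ge 1$ is invoked.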
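The arithmetic behind the constant 7 can be checked numerically. The sketch below assumes the summand of the series has the form $(2^{1+s})^j\exp(-2jx)$ with $x=\delta n/68\ge(1+s)\log 2$, and compares the sum with $\tfrac{7}{5}e^{-x}$, that is, the bound on $J_{2,n}$ with the common factor $5(2n)^s$ removed. The function names are hypothetical.

```python
import math


def j2_series(x, s, terms=500):
    """Partial sum of sum_{j>=1} (2**(1+s))**j * exp(-2*j*x).

    Computed through logarithms so that large powers of two do not overflow.
    """
    log_ratio = (1 + s) * math.log(2) - 2 * x
    return sum(math.exp(j * log_ratio) for j in range(1, terms + 1))


def j2_bound(x):
    """(7/5) * exp(-x): the claimed bound after dividing out 5*(2n)**s."""
    return 7 / 5 * math.exp(-x)


if __name__ == "__main__":
    for s in (1, 2, 3):
        x0 = (1 + s) * math.log(2)  # smallest admissible x = delta*n/68
        print(s, j2_series(x0, s), j2_bound(x0))
```

At the smallest admissible $x$ with $s=1$ the series equals $1/3$ against a bound of $0.35$, which shows that $s\ge 1$ is what makes the constant 7 sufficient.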


When $\delta n\ge 68(1+s)\log 2$, we have $2^rn\delta_r^2\ge 2$. By (3),

$$J_{3,n}\le 2^r\cdot 2\big((2^r\cdot 2n)^s+1\big)\exp(-2^rn\delta_r^2/8). \tag{11}$$


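Take $r=r_n$ to be an integer such that $n/2<2^r\le n$. When $\delta n\ge 68(1+s)\log 2$ we have $n^2\delta_r^2\ge 2$, $2^rn\delta_r^2>n^2\delta^2/2$ and $n^2\delta^2\ge 2\delta n$. By (11) we have

$$J_{3,n}\le 2n\big((2n^2)^s+1\big)\exp(-n^2\delta^2/16)\le 4n(2n^2)^s\exp(-\delta n/8). \tag{12}$$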


Formula  (2)  follows  from  (8),  (10)  and  (12).  The  theorem  is  proved. 


4.  APPLICATIONS 


Theorem  1  has  some  applications  in  strong  convergence  problems  involving 
the  uniform  deviation  between  frequencies  and  probabilities  of  a  class  of  events. 

As an example, we consider the nearest neighbor (NN) density estimates proposed by Loftsgaarden and Quesenberry (1965). Suppose that $X$ is an $R^d$-valued random vector with distribution $\mu$ and unknown density function $f$. The so-called NN estimate of $f(x)$ has the form

$$f_n(x)=k/\{n(2a_n(x))^d\},\qquad x=(x^{(1)},\dots,x^{(d)})\in R^d, \tag{13}$$

where $k=k_n<n$ is a positive integer chosen in advance, and $a_n(x)$ is the smallest $a>0$ such that the cube $[x-a,x+a]=\prod_{i=1}^d[x^{(i)}-a,x^{(i)}+a]$ contains at least $k$ sample points. As an application of Theorem 1, we prove a theorem about the convergence rate of $\sup_{x\in R^d}|f_n(x)-f(x)|$.
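The estimate (13) is straightforward to compute: $a_n(x)$ is the $k$-th smallest max-norm distance from $x$ to the sample points. The following is a minimal sketch under that reading; the function name `nn_density_estimate` and the tuple representation of points are assumptions for illustration.

```python
def nn_density_estimate(x, sample, k):
    """NN density estimate f_n(x) = k / (n * (2*a_n(x))**d).

    x      : point in R^d, given as a tuple of d floats
    sample : list of n points (tuples of d floats)
    k      : number of neighbours, 1 <= k <= n
    a_n(x) is the half-side of the smallest cube centred at x that
    contains at least k sample points.
    """
    n = len(sample)
    d = len(x)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Max-norm distances from x to each sample point, sorted increasingly.
    dists = sorted(max(abs(xi - pi) for xi, pi in zip(x, p)) for p in sample)
    a = dists[k - 1]
    if a == 0:
        raise ValueError("at least k sample points coincide with x")
    return k / (n * (2 * a) ** d)


if __name__ == "__main__":
    sample = [(0.1,), (0.2,), (0.4,), (0.8,)]
    print(nn_density_estimate((0.0,), sample, k=2))
```

Sorting all $n$ distances costs $O(n\log n)$ per query point; this is adequate for illustration, while a spatial index would be the usual choice for many queries.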

In the sequel, we use $C, a, C_1, C_2, \dots$ for some positive constants independent of $n$ and $x$. For $x=(x^{(1)},\dots,x^{(d)})\in R^d$, $y=(y^{(1)},\dots,y^{(d)})\in R^d$, write $f'(x)(y-x)=\sum_{i=1}^d\frac{\partial f(x)}{\partial x^{(i)}}(y^{(i)}-x^{(i)})$, and take $\|y-x\|=\max_{1\le i\le d}|y^{(i)}-x^{(i)}|$.

We say that the density function $f$ belongs to the $\lambda$-class for some $\lambda\in(0,2]$ if either $\lambda\in(0,1]$ and $|f(y)-f(x)|\le C\|y-x\|^\lambda$ for any $x,y\in R^d$, or $\lambda\in(1,2]$, $f$ and $f'$ are bounded, and

$$|f(y)-f(x)-f'(x)(y-x)|\le C\|y-x\|^\lambda$$

for any $x,y\in R^d$. We have

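Theorem 2. Suppose that $f$ belongs to the $\lambda$-class for some $\lambda\in(0,2]$. Take $k=o(n)$ and

$$k/n\ge\beta\Big(\frac{\log n}{n}\Big)^{(d+\lambda)/(d+3\lambda)}, \tag{14}$$

where $\beta>0$ is any given constant. Then

$$\limsup_{n\to\infty}\Big\{(n/k)^{\lambda/(d+\lambda)}\sup_x|f_n(x)-f(x)|\Big\}\le C\quad\text{a.s.} \tag{15}$$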

To prove this theorem, we need the following lemma. In the sequel, $\mu_n$ denotes the empirical measure of $X_1,\dots,X_n$. Besides, a cube of the form $[x-a,x+a]$ is called a regular cube.

Lemma 3. Let $\mathcal{A}$ be a class of regular cubes satisfying the measurability conditions mentioned in paragraph 1 and the condition

$$\sup_{A\in\mathcal{A}}\mu(A)\le k/n\le 1/8.$$

Take $k=o(n)$ and

$$k/n\ge\beta\Big(\frac{\log n}{n}\Big)^{1/(1+2r)}, \tag{16}$$

where $r>0$ and $\beta>0$ is any given constant. Then

$$\limsup_{n\to\infty}\Big\{(n/k)^{1+r}\sup_{A\in\mathcal{A}}|\mu_n(A)-\mu(A)|\Big\}\le C_1\quad\text{a.s.}$$

Noticing that $\mathcal{A}$ is a V-C class, one can obtain Lemma 3 from Theorem 1 immediately. The proof is omitted.

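Proof of Theorem 2. Take $k=o(n)$ and

$$k/n\ge\beta\Big(\frac{\log n}{n}\Big)^{(d+\lambda)/(d+3\lambda)}.$$

Put

$$V_n=\theta_1^{-1}(k/n)^{\lambda/(d+\lambda)},$$

$$q_n=\theta_2V_n=\theta_1^{-1}\theta_2(k/n)^{\lambda/(d+\lambda)},$$

$$B_n=\{x: f(x)\ge V_n\},$$

where $\theta_1,\theta_2\in(0,1)$ will be chosen later.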

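Let $\mu(x,a)$ and $\mu_n(x,a)$ be the probability measure and empirical measure of $[x-a,x+a]$ respectively. Put $M=\max(\sup_xf(x),1)$. We have

$$P\Big\{\sup_{x\in B_n}|f_n(x)-f(x)|>q_n\Big\}\le I_n+J_n, \tag{17}$$

where

$$I_n=P\Big\{\bigcup_{x\in B_n}\big(f_n(x)>f(x)+q_n\big)\Big\},\qquad J_n=P\Big\{\bigcup_{x\in B_n}\big(f_n(x)<f(x)-q_n\big)\Big\}. \tag{18}$$

Thus

$$I_n\le P\Big\{\bigcup_{x\in B_n}\big(a_n(x)<b_n(x)\big)\Big\}, \tag{19}$$

where

$$2b_n(x)=\Big\{\frac{k}{nf(x)}\Big(1+\frac{q_n}{f(x)}\Big)^{-1}\Big\}^{1/d}.$$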

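Fix $x\in B_n=\{x: f(x)\ge V_n\}$. Take $\theta_2<1/8$; then $q_n/f(x)\le\theta_2<1/8$. Noticing $1/(1+t)\le 1-7t/8$ for $0\le t<1/8$, we have

$$2b_n(x)\le\Big\{\frac{k}{nf(x)}\Big(1-\frac{7q_n}{8f(x)}\Big)\Big\}^{1/d}\le\Big(\frac{k}{nf(x)}\Big)^{1/d}.$$

It follows that

$$\begin{aligned}\mu(x,b_n(x))&=\int_{x-b_n(x)}^{x+b_n(x)}f(t)\,dt\\&\le(2b_n(x))^df(x)+C_2(2b_n(x))^{d+\lambda}\\&=(2b_n(x))^df(x)\big[1+C_2(2b_n(x))^\lambda/f(x)\big]\\&\le\frac{k}{n}\Big(1-\frac{7q_n}{8f(x)}\Big)\Big(1+C_2\Big(\frac{k}{nf(x)}\Big)^{\lambda/d}\Big/f(x)\Big).\end{aligned}$$

Fix $\theta_2$, and take $\theta_1$ small enough such that $C_2\theta_1^{(\lambda+d)/d}\le\tfrac{3}{8}\theta_2$; then

$$C_2\Big(\frac{k}{nf(x)}\Big)^{\lambda/d}\le C_2\theta_1^{\lambda/d}(k/n)^{\lambda/(\lambda+d)}\le\tfrac{3}{8}\theta_1^{-1}\theta_2(k/n)^{\lambda/(d+\lambda)}=\tfrac{3}{8}q_n.$$

It follows that

$$\mu(x,b_n(x))\le\frac{k}{n}\Big(1-\frac{1}{2}\,q_n/f(x)\Big)<k/n$$

and

$$\frac{k}{n}-\mu(x,b_n(x))\ge kq_n/(2nM).$$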
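Hence, by (19) and Theorem 1, we have

$$\begin{aligned}I_n&\le P\Big\{\sup_{x\in B_n}\big(\mu_n(x,b_n(x))-\mu(x,b_n(x))\big)\ge kq_n/(2nM)\Big\}\\&\le C_5n^a\Big\{\exp\Big(-\frac{n(kq_n/2nM)^2}{91k/n+2kq_n/nM}\Big)+\exp(-k/68)\Big\},\end{aligned}$$

where $a$ is a constant depending only on $d$. In view of (14), we have for large $n$

$$I_n\le C_5n^a\Big\{\exp\big(-\theta_1^{-2}\theta_2^2M^{-2}\beta^{1+2\lambda/(d+\lambda)}\log n/400\big)+\exp(-k/68)\Big\}. \tag{20}$$

Taking $\theta_1$ small enough, we have

$$\sum_{n=1}^{\infty}I_n<\infty. \tag{21}$$

In the same way, we can take $\theta_1$ and $\theta_2$ such that

$$\sum_{n=1}^{\infty}J_n<\infty.$$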

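By (17), (18), (20) and (21), we have

$$\sum_{n=1}^{\infty}P\Big\{q_n^{-1}\sup_{x\in B_n}|f_n(x)-f(x)|>1\Big\}<\infty.$$

By the Borel-Cantelli lemma,

$$\limsup_{n\to\infty}\Big\{q_n^{-1}\sup_{x\in B_n}|f_n(x)-f(x)|\Big\}\le 1\quad\text{a.s.} \tag{22}$$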

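Fix $\theta_1,\theta_2$, and take $2b_n=C_3(k/n)^{1/(d+\lambda)}$. Fix $x\in B_n^c=\{x: f(x)<V_n\}$. With small $C_3$ we have

$$\begin{aligned}\mu(x,b_n)&=\int_{x-b_n}^{x+b_n}f(t)\,dt\\&\le(2b_n)^df(x)+C_2(2b_n)^{d+\lambda}\\&\le\frac{k}{n}\big[\theta_1^{-1}C_3^d+C_2C_3^{d+\lambda}\big]\le k/2n<k/n.\end{aligned}$$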

Taking $r=\lambda/(d+\lambda)$ in Lemma 3, we can assert with probability one that, for $n$ large enough, the inequality

$$\mu_n(x,b_n)\le\mu(x,b_n)+2C_1(k/n)^{(d+2\lambda)/(d+\lambda)}\le k/2n+2C_1(k/n)^{(d+2\lambda)/(d+\lambda)}<k/n$$

holds uniformly for $x\in B_n^c$. By definition, for $x\in B_n^c$,

